Structured sparsity with convex penalty functions
نویسنده
چکیده
We study the problem of learning a sparse linear regression vector under additional conditions on the structure of its sparsity pattern. This problem is relevant in Machine Learning, Statistics and Signal Processing. It is well known that a linear regression can benefit from knowledge that the underlying regression vector is sparse. The combinatorial problem of selecting the nonzero components of this vector can be “relaxed” by regularising the squared error with a convex penalty function like the l1 norm. However, in many applications, additional conditions on the structure of the regression vector and its sparsity pattern are available. Incorporating this information into the learning method may lead to a significant decrease of the estimation error. In this thesis, we present a family of convex penalty functions, which encode prior knowledge on the structure of the vector formed by the absolute values of the regression coefficients. This family subsumes the l1 norm and is flexible enough to include different models of sparsity patterns, which are of practical and theoretical importance. We establish several properties of these penalty functions and discuss some examples where they can be computed explicitly. Moreover, for solving the regularised least squares problem with these penalty functions, we present a convergent optimisation algorithm and proximal method. Both algorithms are useful numerical techniques taylored for different kinds of penalties. Extensive numerical simulations highlight the benefit of structured sparsity and the advantage offered by our approach over the Lasso method and other related methods, such as using other convex optimisation penalties or greedy methods.
منابع مشابه
A Family of Penalty Functions for Structured Sparsity
We study the problem of learning a sparse linear regression vector under additional conditions on the structure of its sparsity pattern. We present a family of convex penalty functions, which encode this prior knowledge by means of a set of constraints on the absolute values of the regression coefficients. This family subsumes the l1 norm and is flexible enough to include different models of sp...
متن کاملDifference-of-Convex Learning: Directional Stationarity, Optimality, and Sparsity
This paper studies a fundamental bicriteria optimization problem for variable selection in statistical learning; the two criteria are a loss/residual function and a model control (also called regularization, penalty). The former function measures the fitness of the learning model to data and the latter function is employed as a control of the complexity of the model. We focus on the case where ...
متن کاملRecovering sparse signals with non-convex penalties and DC programming
This paper considers the problem of recovering a sparse signal representation according to a signal dictionary. This problem is usually formalized as a penalized least-squares problem in which sparsity is usually induced by a l1-norm penalty on the coefficient. Such an approach known as the Lasso or Basis Pursuit Denoising has been shown to perform reasonably well in some situations. However, i...
متن کاملA unified perspective on convex structured sparsity: Hierarchical, symmetric, submodular norms and beyond
In this paper, we propose a unified theory for convex structured sparsity-inducing norms on vectors associated with combinatorial penalty functions. Specifically, we consider the situation of a model simultaneously (a) penalized by a set-function defined on the support of the unknown parameter vector which represents prior knowledge on supports, and (b) regularized in `pnorm. We show that each ...
متن کاملSMOOTHING PROXIMAL GRADIENT METHOD FOR GENERAL STRUCTURED SPARSE REGRESSION By
We study the problem of estimating high dimensional regression models regularized by a structured sparsity-inducing penalty that encodes prior structural information on either the input or output variables. We consider two widely adopted types of penalties of this kind as motivating examples: 1) the general overlapping-group-lasso penalty, generalized from the group-lasso penalty; and 2) the gr...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2012